Parallel parsing made practical
Authors
Abstract
The local parsability property allows an input to be parsed by inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. Such an algorithm lends itself to automatic generation by a parser generator tool, which we implemented and also present here. Furthermore, to complete the framework for parallel input analysis, a parallel scanner can also be combined with the parser. To demonstrate the practicality of parallel lexing and parsing, we report on the adaptation of JSON and Lua to a form fit for parallel parsing (i.e., an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high-performance and embedded multicore platforms, obtained by analyzing real-world inputs as a test bench. The results show that our approach matches or exceeds the performance of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multicore machines. The work concludes with a broad and critical survey of past work on parallel parsing and of future directions for integration with semantic analysis and incremental parsing.
Similar resources
Massively Parallel Memory-Based Parsing
This paper discusses a radically new scheme of natural language processing called massively parallel memory-based parsing. Most parsing schemes are rule-based or principle-based, which involves extensive serial rule application. Thus, it is a time-consuming task which requires a few seconds or even a few minutes to complete the parsing of one sentence. Also, the degree of parallelism attained b...
PAPAGENO: A Parallel Parser Generator for Operator Precedence Grammars
In almost all language processing applications, languages are parsed employing classical algorithms (such as the LR(1) parsers generated by Bison), which are sequential due to their left-to-right state-dependent nature. Although early theoretical studies on parallel parsing algorithms delineated potential speedups on abstract parallel machines using a data-parallel approach, practical developme...
HPar: A Practical Parallel Parser for HTML – Taming HTML Complexities for Parallel Parsing
Parallelizing HTML parsing is challenging due to the complexities of HTML documents and the inherent dependences in its parsing algorithm. As a result, despite numerous studies in parallel parsing, HTML parsing remains sequential today. It forms one of the final barriers for fully parallelizing browser operations to minimize the browser’s response time—an important variable for user experiences...
A Parallel Extension of Earley’s Parsing Algorithm
Parsing is the process of deriving structure from a string, and can be used to describe the meaning of the string, and the relationships between its elements. This paper describes two popular parsing algorithms, CKY and Earley. This paper also discusses attempts others have made to distribute the processing workload of the CKY algorithm in a parallel environment. The paper then describes how I ...
A Parallel Augmented Context-Free Parsing System For Natural Language Analysis
Parsing efficiency is one of the important issues in building practical natural language processing systems. This paper proposes a design and an implementation of a parallel augmented context-free parsing system for natural language analysis. Natural language grammars are more than context-free, so that unification formalisms are adopted to enforce the linguistic constraints and to transfer the...
Journal title:
- Sci. Comput. Program.
Volume 112, Issue
Pages -
Publication date 2015